Validation Methodology for Expert-Annotated Datasets: Event Annotation Case Study
Event detection remains a difficult task due to the complexity and ambiguity of events. On the one hand, we observe low inter-annotator agreement among experts when annotating events, despite the multitude of existing annotation guidelines and their numerous revisions. On the other hand, event extraction systems achieve lower measured performance in terms of F1-score than systems extracting other types of entities, such as people or locations. In this paper we study the consistency and completeness of expert-annotated datasets for events and time expressions, and we propose a data-agnostic methodology for validating such datasets along both dimensions. Furthermore, we combine the power of crowds and machines to correct and extend expert-annotated event datasets. We show the benefit of using crowd-annotated events to train and evaluate a state-of-the-art event extraction system. Our results show that the crowd-annotated events increase the performance of the system by at least 5.3%.
CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement
Crowdsourcing-based approaches to gathering annotated data typically use inter-annotator agreement as a measure of quality. However, in many domains there is ambiguity in the data, as well as a multitude of perspectives on the information examples. In this paper, we present ongoing work on the CrowdTruth metrics, which capture and interpret inter-annotator disagreement in crowdsourcing. The CrowdTruth metrics model the inter-dependency between the three main components of a crowdsourcing system -- worker, input data, and annotation -- with the goal of capturing the degree of ambiguity in each of these three components. The metrics are available online at https://github.com/CrowdTruth/CrowdTruth-core
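To make the idea of disagreement-as-signal concrete, here is a minimal, simplified sketch of one of the ingredients the abstract describes: scoring how clear or ambiguous a single input unit is from the spread of its workers' annotation vectors. This is not the CrowdTruth-core API; the library computes iteratively weighted versions of these scores, and the function names below are illustrative.

```python
import numpy as np
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two annotation vectors (0 if either is all-zero)."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / norm) if norm > 0 else 0.0

def unit_quality(worker_vectors):
    """Average pairwise cosine similarity between worker vectors on one unit.

    A high score means workers largely agree (a clear unit); a low score
    signals ambiguity in the input data rather than necessarily bad workers.
    """
    pairs = list(combinations(worker_vectors, 2))
    if not pairs:
        return 1.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Three workers annotate one unit over four candidate labels (binary vectors).
w1 = np.array([1, 0, 0, 1])
w2 = np.array([1, 0, 0, 0])
w3 = np.array([0, 1, 0, 0])
print(unit_quality([w1, w2, w3]))  # ~0.24: a low score, i.e. an ambiguous unit
```

The same vector-similarity idea extends to the other two components: a worker's quality can be scored by comparing their vectors against the aggregate of the other workers, and an annotation's quality by how consistently it co-occurs across units.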
Empirical Methodology for Crowdsourcing Ground Truth
The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity as a way to address the volume of data and the lack of annotators. Typically these practices use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, there is ambiguity in the data, as well as a multitude of perspectives on the information examples. We present an empirically derived methodology for efficiently gathering ground truth data across a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach is the use of the CrowdTruth metrics, which capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of data aggregated with the CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction, and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, going against the usual practice of employing a small number of annotators.

Comment: in publication at the Semantic Web Journal